8,184 research outputs found
Unity in diversity : integrating differing linguistic data in TUSNELDA
This paper describes the creation and preparation of TUSNELDA, a collection of corpus data built for linguistic research. This collection contains a number of linguistically annotated corpora which differ in various aspects such as language, text sorts / data types, encoded annotation levels, and linguistic theories underlying the annotation. The paper focuses on this variation on the one hand and on the way these heterogeneous data are integrated into one resource on the other hand.
Evaluating POS tagging under sub-optimal conditions : or: does meticulousness pay?
In this paper, we investigate the role of sub-optimality in training data for part-of-speech tagging. In particular, we examine to what extent the size of the training corpus and certain types of errors in it affect the performance of the tagger. We distinguish four types of errors: if a word is assigned a wrong tag, this tag can belong to the ambiguity class of the word (i.e. to the set of possible tags for that word) or not; furthermore, the major syntactic category (e.g. "N" or "V") can be correctly assigned (e.g. if a finite verb is classified as an infinitive) or not (e.g. if a verb is classified as a noun). We empirically explore the decrease in performance that each of these error types causes for different sizes of the training set. Our results show that those types of errors that are easier to eliminate have a particularly negative effect on the performance. Thus, it is worthwhile concentrating on the elimination of these types of errors, especially if the training corpus is large.
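The four error types in this abstract form a two-by-two grid: whether the wrong tag lies in the word's ambiguity class, and whether the major syntactic category is preserved. A minimal Python sketch of that classification follows. The example lexicon, the STTS-style tag names, and the rule that the major category is the first letter of the tag are illustrative assumptions, not details taken from the paper.

```python
from typing import Callable, Dict, Set


def first_letter(tag: str) -> str:
    # Assumption: the major category is the tag's first letter ("V", "N", ...).
    return tag[0]


def classify_error(
    word: str,
    gold_tag: str,
    wrong_tag: str,
    lexicon: Dict[str, Set[str]],
    major: Callable[[str], str] = first_letter,
) -> str:
    """Place a tagging error into one of the four types from the abstract."""
    if wrong_tag == gold_tag:
        raise ValueError("not an error: the tags are identical")
    # Dimension 1: is the wrong tag one of the tags this word can carry?
    in_class = wrong_tag in lexicon.get(word, set())
    # Dimension 2: does the wrong tag keep the major syntactic category?
    same_major = major(wrong_tag) == major(gold_tag)
    where = "inside" if in_class else "outside"
    category = "kept" if same_major else "changed"
    return f"{where} ambiguity class, major category {category}"


# Hypothetical ambiguity class for one German word form.
LEXICON = {"laufen": {"VVFIN", "VVINF", "NN"}}

if __name__ == "__main__":
    for wrong in ("VVINF", "NN", "VAFIN", "ADJA"):
        print(wrong, "->", classify_error("laufen", "VVFIN", wrong, LEXICON))
```

Keeping the two dimensions as separate boolean checks makes it straightforward to count each error type independently when measuring its effect on tagger accuracy.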
The TUSNELDA annotation standard : an XML encoding standard for multilingual corpora supporting various aspects of linguistic research
This paper proposes a corpus encoding standard that meets the needs of linguistic research using a variety of linguistic data structures. The standard was developed in SFB 441, a research project at the University of Tuebingen. The principal concern of SFB 441 is the empirical data structures which feed into linguistic theory building. SFB 441 consists of several projects, most of which are building corpora to empirically investigate various linguistic phenomena in various languages (e.g. modal verbs in German, forms of address and politeness in Russian). These corpora will form the components of the "Tuebingen collection of reusable, empirical, linguistic data structures (TUSNELDA)". The TUSNELDA annotation standard aims at providing a uniform encoding scheme for all subcorpora and texts of TUSNELDA such that they can be processed with uniform standardized tools. To guarantee maximal reusability we use XML for encoding. Previous SGML standards for text encoding were provided by the Text Encoding Initiative (TEI) and the Expert Advisory Group on Language Engineering Standards (Corpus Encoding Standard, CES). The TUSNELDA standard is based on TEI and XCES (XML version of CES) but takes into account the specific needs of the SFB projects, i.e. the peculiarities of the examined languages and linguistic phenomena.
Models for the two-phase flow of concentrated suspensions
A new two-phase model for concentrated suspensions is derived that
incorporates a constitutive law combining the rheology for non-Brownian
suspension and granular flow. The resulting model exhibits a yield-stress
behavior for the solid phase depending on the collision pressure. This property
is investigated for the simple geometry of plane Poiseuille flow, where an
unyielded or jammed zone of finite width arises in the center of the channel.
For the steady states of this problem, the governing equations are reduced to a
boundary value problem for a system of ordinary differential equations and the
conditions for existence of solutions with jammed regions are investigated
using phase-space methods. For the general time-dependent case a new drift-flux
model is derived using matched asymptotic expansions that takes into account
the boundary layers at the walls and the interface between the yielded and
unyielded region. The drift-flux model is used to numerically study the dynamic
behavior of the suspension flow including the appearance and evolution of an
unyielded or jammed region.
Localized Instabilities and Spinodal Decomposition in Driven Systems in the Presence of Elasticity
We study numerically and analytically the instabilities associated with phase
separation in a solid layer on which an external material flux is imposed. The
first instability is localized within a boundary layer at the exposed free
surface by a process akin to spinodal decomposition. In the limiting static
case, when there is no material flux, the coherent spinodal decomposition is
recovered. In the present problem stability analysis of the time-dependent and
non-uniform base states as well as numerical simulations of the full governing
equations are used to establish the dependence of the wavelength and onset of
the instability on parameter settings and its transient nature as the patterns
eventually coarsen into a flat moving front. The second instability is related to
the Mullins-Sekerka instability in the presence of elasticity and arises at
the moving front between the two phases when the flux is reversed. Stability
analyses of the full model and the corresponding sharp-interface model are
carried out and compared. Our results demonstrate how interface and bulk
instabilities can be analysed within the same framework, which allows each
of them to be identified and distinguished clearly. The relevance for a detailed
understanding of both instabilities and their interconnections in a realistic
setting is demonstrated for a system of equations modelling the
lithiation/delithiation processes within the context of lithium-ion batteries. Comment: 8 figures, 19 pages
Thin film models for active gels
In this study we present a free-boundary problem for an active liquid crystal
based on the Beris-Edwards theory that uses a tensorial order parameter and
includes active contributions to the stress tensor to analyse the rich defect
structure observed in applications such as the adenosine triphosphate (ATP)-driven
motion of a thin film of an actin filament network. The small aspect
ratio of the film geometry allows for an asymptotic approximation of the
free-boundary problem in the limit of weak elasticity of the network and strong
active terms. The new thin film model captures the defect dynamics in the bulk
as well as wall defects and thus presents a significant extension of previous
models based on the Leslie-Ericksen-Parodi theory. Analytic expressions are
derived that reveal the interplay of anchoring conditions, film thickness and
active terms and their control of transitions of flow structure. Comment: 33 pages, 3 figures
Boundary-induced phase transitions in a space-continuous traffic model with non-unique flow-density relation
The Krauss-model is a stochastic model for traffic flow which is continuous
in space. For periodic boundary conditions it is well understood and known to
display a non-unique flow-density relation (fundamental diagram) for certain
densities. In many applications, however, the behaviour under open boundary
conditions plays a crucial role. In contrast to all models investigated so far,
the high flow states of the Krauss-model are not metastable but stable.
Nevertheless we find that the current in open systems obeys an extremal
principle introduced for the case of simpler discrete models. The phase diagram
of the open system will be completely determined by the fundamental diagram of
the periodic system through this principle. In order to allow the investigation
of the whole state space of the Krauss-model, appropriate strategies for the
injection of cars into the system are needed. Two methods for solving this problem
are discussed and the boundary-induced phase transitions for both methods are
studied. We also suggest a supplementary rule for the extremal principle to
account for cases where not all the possible bulk states are generated by the
chosen boundary conditions. Comment: 12 pages, 14 figures
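The extremal principle mentioned in this abstract is commonly stated for simpler discrete models as follows: the stationary current of the open system is the maximum of the flow-density relation j(ρ) over the interval between the boundary densities when the left boundary density exceeds the right one, and the minimum otherwise. The Python sketch below illustrates that rule only. It assumes a single-valued fundamental diagram and uses the exclusion-process form j(ρ) = ρ(1 − ρ) as a stand-in; it is not the Krauss model, whose flow-density relation is non-unique, and the function names are hypothetical.

```python
from typing import Callable


def extremal_current(
    j: Callable[[float], float],
    rho_left: float,
    rho_right: float,
    samples: int = 10001,
) -> float:
    """Approximate the open-system current by sampling j on a density grid."""
    lo, hi = sorted((rho_left, rho_right))
    grid = [lo + (hi - lo) * k / (samples - 1) for k in range(samples)]
    values = [j(rho) for rho in grid]
    # Maximum if the left boundary is denser than the right, minimum otherwise.
    return max(values) if rho_left > rho_right else min(values)


def exclusion_flow(rho: float) -> float:
    # Assumption: illustrative single-valued fundamental diagram.
    return rho * (1.0 - rho)


if __name__ == "__main__":
    for left, right in ((0.8, 0.3), (0.2, 0.3), (0.2, 0.9)):
        current = extremal_current(exclusion_flow, left, right)
        print(f"rho_left={left}, rho_right={right}: current ~ {current:.4f}")
```

Sampling on a grid keeps the sketch independent of any optimisation library; for a fundamental diagram with sharp features, a finer grid or an exact search over its extrema would be more appropriate.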